ABSTRACT

This paper deals with a visual-motion fixation invariant. We show that during fixation there is a
measurable nonlinear function of optical flow that produces the same value for all points of a
stationary environment regardless of the 3-D shape of the environment. During fixated camera motion
relative to a rigid object, e.g., a stationary environment, the projection of the fixated point remains (by
definition) at the same location in the image, and all other points located on the 3-D rigid object can
only rotate relative to the 3-D fixation point. This rotation rate is the same for all points
that lie on the particular environment, and it is measurable from a sequence of images. This new
invariant is obtained from a set of monocular images, and is expressed explicitly as a closed-form
solution. In this paper we show how to extract this invariant analytically from a sequence of images
using optical flow information, and we present results obtained from real data experiments.


This work was supported in part by a grant from the National Science Foundation, Division of Information,
Robotics and Intelligent Systems, Grant # IRI-9115939.


1:  Introduction 

During  eye  motion  in  a stationary  environment,  the  projection  of  objects  in  the  eye  is 
continuously  changing,  yet  we  perceive  the  environment  as  stationary.  Are  there  properties  of  the  image 
that  under  some  transformation  remain  constant  during  relative  motion  between  the  eye  and  the 
environment?  In  other  words,  are  there  visual-motion  invariants  [1],[2]? 

Gibson [3] captured this idea as follows: "If invariants of the energy flux at the receptors of an
organism exist and if these invariants correspond to the permanent properties of the environment, and
if they are the basis of the organism's perception of the environment instead of the sensory data on
which we have thought it based, then I think there is new support for realism in epistemology as well
as for a new theory of perception in psychology."

In this paper we derive an optical-flow-based invariant that exists while a camera fixates
on a point located on a rigid body (e.g., a stationary environment). It is a new and collective
representation of 3-D points that shows that during fixation there is a measurable nonlinear function of
optical flow that produces the same value for all points of a rigid 3-D environment regardless of the
structure of the environment. In other words, this is a scene-independent visual-motion invariant.
Note that this invariant exists only when motion is present, as opposed to invariants that exist in still
images (see [7]-[9]).

A very basic observation led us to the derivation of this invariant. During the fixation process
[4], any camera motion can be described as camera rotation and translation relative to the 3-D fixation
point (Figure 1). This motion can also be described as translation of the camera along the
camera-fixation-point line, and rotation of the stationary environment relative to the 3-D fixation point. In
other words, during the fixation process points on the 3-D rigid environment cannot translate relative to
the fixation point, yet they can rotate relative to that point. This rate of rotation (angular velocity) is
the same (i.e., invariant) for all points that lie on the 3-D stationary environment at any given instant of
time, and it is measurable from a sequence of images. We show how to extract this invariant
analytically from a sequence of images using the optical flow information, and we also present the
results obtained from real data experiments using a theodolite and a CCD video camera.


Figure 1: Camera's motion relative to the fixation point. (Figure labels: camera path; F = fixation point and camera axis of rotation.)


Some  of  the  prominent  features  of  this  invariant  are: 

1.  It  is  a new  collective  representation  of  3-D  points. 

2.  Only  one  camera  is  needed  to  extract  this  invariant. 

3.  Camera  motion  need  not  be  known. 


4.  It  exists  during  fixation,  and  the  fixation  point  can  be  chosen  arbitrarily. 

5.  It  is  a measurable,  nonlinear  function  of  optical  flow,  i.e.,  it  is  measured  from  visual  data  in  camera 
coordinates. 

6.  It  produces  the  same  value  for  all  points  of  a 3-D  rigid  object  regardless  of  its  3-D  shape,  i.e.,  it  is 
valid  for  any  structure  in  the  stationary  environment. 

7.  There  is  no  need  for  3-D  reconstruction. 

8.  The  invariant  is  obtained  from  a closed  form  expression,  and  it  is  measured  in  time  units  rather  than 
distance  units. 

9.  Using  a logarithmic  retina,  this  invariant  can  be  obtained  directly,  i.e.,  without  many  additional 
computations. 

2:  A fixation  invariant 

The following assumptions are made in the derivation of the invariant: a) The camera axis of
rotation is perpendicular to the line that connects the fixation point with the pinhole point of the
camera. b) There is no relative translation between the fixation point and the camera. c) The angular
velocity of the camera is constant. d) The angles extracted from the image are approximately equal to
the angles extracted from an orthographic projection plane. In practice, to approximate such a
projection, a narrow field of view camera should be used, and the observed object has to be placed far
away from the camera.

Figure 1 shows a moving camera fixated on a stationary object. Without loss of generality, we
chose to derive the invariant for a stationary camera fixated on a point located on a rotating 3-D object
(Figure 2). This choice simplifies the derivation of the invariant, as well as the experimental setup.

In Figure 2, points A, B and F are points on the 3-D object, where point F marks the 3-D
fixation point. Note that points A, B and F need not be on the same horizontal plane. Points F', A' and
B' are the projections of points F, A and B on the projection plane, respectively, and points F*, A* and
B* are the projections of points F', A' and B' on the image plane, respectively. The axis of rotation of
the object is perpendicular to the page. Note that for the chosen projection the following derivation is
independent of the distance $l$ between F* and F'.


Figure  2:  Angles  extracted  from  projection  plane. 


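From Figure 2, for any point on the rigid body (under the stated assumptions, and using $\beta = \beta(t)$ and $\theta = \theta(t)$), the
following relationship holds:

$$\tan\beta = \frac{r\cos\theta}{l} \tag{1}$$

Taking the derivative of both sides of Equation (1) with respect to time, we get:

$$\frac{\dot{\beta}}{\cos^2\beta} = -\frac{r}{l}\,\omega\sin\theta \tag{2}$$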
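where the dot above a variable denotes the derivative with respect to time, and $\omega = d\theta/dt$ is the
unknown angular velocity of the rigid object about the fixation point. Dividing Equation (2)
by Equation (1) and manipulating the result, we obtain:

$$\frac{\dot{\beta}}{\cos^2\beta\,\tan\beta} = -\omega\tan\theta \tag{3}$$

Therefore, since $\cos^2\beta\,\tan\beta = \tfrac{1}{2}\sin(2\beta)$,

$$\theta = \tan^{-1}\left(\frac{-2\dot{\beta}}{\omega\sin(2\beta)}\right) \tag{4}$$

Specifically, for two visible points A and B at time instant $t$, we can write:

$$\theta_A(t) = \tan^{-1}\left(\frac{-2\dot{\beta}_A(t)}{\omega\sin[2\beta_A(t)]}\right) \tag{5}$$

and

$$\theta_B(t) = \tan^{-1}\left(\frac{-2\dot{\beta}_B(t)}{\omega\sin[2\beta_B(t)]}\right) \tag{6}$$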

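where subscripts A and B correspond to points A and B of the rigid object, respectively, at
time instant $t$. Subtracting Equation (6) from Equation (5) at time instants $t_1$ and $t_2$ yields Equations
(7) and (8), respectively:

$$\theta_A(t_1) - \theta_B(t_1) = \tan^{-1}\left(\frac{-2\dot{\beta}_A(t_1)}{\omega\sin[2\beta_A(t_1)]}\right) - \tan^{-1}\left(\frac{-2\dot{\beta}_B(t_1)}{\omega\sin[2\beta_B(t_1)]}\right) \tag{7}$$

$$\theta_A(t_2) - \theta_B(t_2) = \tan^{-1}\left(\frac{-2\dot{\beta}_A(t_2)}{\omega\sin[2\beta_A(t_2)]}\right) - \tan^{-1}\left(\frac{-2\dot{\beta}_B(t_2)}{\omega\sin[2\beta_B(t_2)]}\right) \tag{8}$$

Since $\theta_A - \theta_B$ is constant at all time instants, one can equate Equations (7) and (8) to get:

$$\tan^{-1}\left(\frac{-2\dot{\beta}_A(t_1)}{\omega\sin[2\beta_A(t_1)]}\right) - \tan^{-1}\left(\frac{-2\dot{\beta}_B(t_1)}{\omega\sin[2\beta_B(t_1)]}\right) = \tan^{-1}\left(\frac{-2\dot{\beta}_A(t_2)}{\omega\sin[2\beta_A(t_2)]}\right) - \tan^{-1}\left(\frac{-2\dot{\beta}_B(t_2)}{\omega\sin[2\beta_B(t_2)]}\right) \tag{9}$$

Using the identity

$$\tan(\alpha - \beta) = \frac{\tan\alpha - \tan\beta}{1 + \tan\alpha\tan\beta} \tag{10}$$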


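Equation (9) can be written as follows (a sign common to both sides cancels):

$$\frac{\dfrac{a_1}{\omega} - \dfrac{b_1}{\omega}}{1 + \dfrac{a_1 b_1}{\omega^2}} = \frac{\dfrac{a_2}{\omega} - \dfrac{b_2}{\omega}}{1 + \dfrac{a_2 b_2}{\omega^2}} \tag{11}$$

where:

$$a_1 = \frac{2\dot{\beta}_A(t_1)}{\sin[2\beta_A(t_1)]} \tag{12}$$

$$b_1 = \frac{2\dot{\beta}_B(t_1)}{\sin[2\beta_B(t_1)]} \tag{13}$$

$$a_2 = \frac{2\dot{\beta}_A(t_2)}{\sin[2\beta_A(t_2)]} \tag{14}$$

$$b_2 = \frac{2\dot{\beta}_B(t_2)}{\sin[2\beta_B(t_2)]} \tag{15}$$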

Manipulation of Equation (11) yields the following:

$$\omega^2 = \frac{a_1 b_1 (a_2 - b_2) - a_2 b_2 (a_1 - b_1)}{(a_1 - b_1) - (a_2 - b_2)} \tag{16}$$

Note that $a_1$, $b_1$, $a_2$ and $b_2$ have to be finite numbers, and $(a_1 - b_1) \neq (a_2 - b_2)$.

The meaning of Equation (16) is that during fixation there is a measurable nonlinear function of
optical flow that produces the same value (invariant) for all points of a stationary environment
regardless of the 3-D structure of the environment. This invariant is $\omega$. Note that, because
Equation (16) determines only $\omega^2$, the solution does not distinguish between clockwise and
counter-clockwise rotations.
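
As an illustration, Equations (12)-(16) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the code used in the experiments: the function names (`flow_coefficient`, `omega_from_flow`, `observe`, `demo`) and all numerical values of r, theta, l and the two time instants are hypothetical, and the observations are generated directly from Equations (1) and (2) rather than from images.

```python
import math


def flow_coefficient(beta, beta_dot):
    """Coefficient 2*beta_dot / sin(2*beta), as in Equations (12)-(15)."""
    return 2.0 * beta_dot / math.sin(2.0 * beta)


def omega_from_flow(a1, b1, a2, b2):
    """Magnitude of the angular velocity from Equation (16)."""
    denom = (a1 - b1) - (a2 - b2)
    if denom == 0.0:
        raise ValueError("singular case: (a1 - b1) equals (a2 - b2)")
    omega_sq = (a1 * b1 * (a2 - b2) - a2 * b2 * (a1 - b1)) / denom
    if omega_sq < 0.0:
        raise ValueError("inconsistent data: Equation (16) gave a negative value")
    return math.sqrt(omega_sq)


def observe(r, theta0, omega, l, t):
    """Exact (beta, beta_dot) of one point from Equations (1) and (2)."""
    theta = theta0 + omega * t
    beta = math.atan(r * math.cos(theta) / l)
    beta_dot = -(r / l) * omega * math.sin(theta) * math.cos(beta) ** 2
    return beta, beta_dot


def demo(omega=10.0, l=150.0, t1=0.0, t2=0.02):
    """Recover |omega| from two hypothetical points A and B at instants t1, t2."""
    point_a = (22.0, 0.3)   # (r, initial theta): made-up values
    point_b = (30.0, 1.4)   # (r, initial theta): made-up values
    a1 = flow_coefficient(*observe(*point_a, omega, l, t1))
    b1 = flow_coefficient(*observe(*point_b, omega, l, t1))
    a2 = flow_coefficient(*observe(*point_a, omega, l, t2))
    b2 = flow_coefficient(*observe(*point_b, omega, l, t2))
    return omega_from_flow(a1, b1, a2, b2)


if __name__ == "__main__":
    print(demo())
```

With exact flow values the estimate agrees with the simulated angular velocity up to rounding, independently of the chosen radii and of l. Only the magnitude is returned, so a negative simulated angular velocity gives the same result.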

An interesting fact arises from the derivation of the invariant. The expressions in Equations
(12)-(15) can be obtained linearly using a logarithmic retina, since the variables $a_1$, $a_2$, $b_1$ and $b_2$ are
derivatives of logarithmic expressions. Specifically, we have

$$\frac{2\dot{\beta}}{\sin(2\beta)} = \frac{d}{dt}\left[\ln(\tan\beta)\right] \tag{17}$$

i.e., these coefficients can be obtained directly from a linear change of visual data as measured by a
logarithmic retina [6].
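
The logarithmic relation can be checked numerically. The sketch below is only an illustration: the trajectory `beta_of_t` is an arbitrary, hypothetical smooth function (not data from the experiments), and the time derivative of ln(tan beta) is approximated by a central difference. The absolute value inside the logarithm is an added convenience so that the check also works for negative angles.

```python
import math


def coefficient(beta, beta_dot):
    """2*beta_dot / sin(2*beta): the form of the coefficients in (12)-(15)."""
    return 2.0 * beta_dot / math.sin(2.0 * beta)


def log_tan(beta):
    """ln|tan(beta)|, the quantity a logarithmic sensor would report."""
    return math.log(abs(math.tan(beta)))


def log_rate(beta_of_t, t, h=1e-6):
    """Central-difference estimate of d/dt ln|tan(beta(t))|."""
    return (log_tan(beta_of_t(t + h)) - log_tan(beta_of_t(t - h))) / (2.0 * h)


def beta_of_t(t):
    """Hypothetical smooth angle trajectory, kept inside (0, pi/2)."""
    return 0.2 + 0.05 * math.sin(3.0 * t)


def beta_dot_of_t(t):
    """Analytic derivative of beta_of_t."""
    return 0.15 * math.cos(3.0 * t)


if __name__ == "__main__":
    t = 0.4
    print(coefficient(beta_of_t(t), beta_dot_of_t(t)), log_rate(beta_of_t, t))
```

The two printed numbers should agree to within the accuracy of the finite difference, which is what allows the coefficients to be read off as a plain temporal difference of log-encoded data.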


3:  The  experiment 

In this section we describe: 1) Simulations used to verify the result of the derivation and to
test the theoretical limitations of the method. 2) Real data measurements using a theodolite to test
some basic practical aspects of the approach. 3) Real data measurements using a CCD video camera.


3.1:  Simulation 

We used computer simulations to verify the result in Equation (16), and to test the theoretical
limitations of the method. The locations of two points on an arbitrary rotating object were simulated,
and the angular velocity of the object was obtained using Equations (12)-(16). The points on the object
(A and B) were selected arbitrarily with random radii ($r_A$ and $r_B$), and with random initial phases ($\theta_A$
and $\theta_B$). During the simulated motion the program selects four consecutive time instants ($t_1$, $t_2$, $t_3$ and
$t_4$), where $\Delta t = (t_2 - t_1) = (t_4 - t_3) \ll (t_3 - t_2)$. At each one of these four time instants the program obtained
the spatial location of the points A and B, and using simple vector analysis it calculated four angles,
$\beta_A(t_1)$, $\beta_B(t_1)$, $\beta_A(t_3)$ and $\beta_B(t_3)$, and their approximated derivatives:


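$$\dot{\beta}_A(t_1) \approx \frac{\beta_A(t_2) - \beta_A(t_1)}{\Delta t} \tag{18}$$

$$\dot{\beta}_B(t_1) \approx \frac{\beta_B(t_2) - \beta_B(t_1)}{\Delta t} \tag{19}$$

$$\dot{\beta}_A(t_3) \approx \frac{\beta_A(t_4) - \beta_A(t_3)}{\Delta t} \tag{20}$$

$$\dot{\beta}_B(t_3) \approx \frac{\beta_B(t_4) - \beta_B(t_3)}{\Delta t} \tag{21}$$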

These  angles  and  their  derivatives  were  substituted  in  Equations  (12)-(15)  to  obtain  the  final  result 
which  is  the  angular  velocity  of  the  rotating  object. 

We varied some simulation factors such as: $(t_3 - t_2)$, $\omega\Delta t$, $l$, the ratio $(r_A/r_B)$, the ratio $(\Delta t/\omega)$ and
the ratio $(\theta_1/\theta_2)$. Table 1 shows the effect of different values of $\Delta t$ on the results for an angular velocity
of 10 rad/s. Values of $\Delta t$ vary from $10^{-5}$ to $10^{-1}$ seconds, and as expected the error generally
increases with $\Delta t$. However, even for the case of $\Delta t = 0.1$ seconds, i.e., $\omega\Delta t = 1$ radian, the error
in the result is only about 4%.
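
The simulation procedure can be sketched as follows. This is an illustration under assumptions, not the original program: the geometry values and the function names `beta` and `estimate_omega` are hypothetical, the angles are generated from Equation (1), and the derivatives are approximated by forward differences over dt (an assumption about the exact form of the approximation).

```python
import math


def beta(r, theta0, omega, l, t):
    """Image angle of one point at time t, from Equation (1)."""
    return math.atan(r * math.cos(theta0 + omega * t) / l)


def estimate_omega(omega, dt, gap, l=150.0,
                   point_a=(22.0, 0.3), point_b=(30.0, 1.1)):
    """Estimate |omega| from angles sampled at t1, t2 = t1 + dt, t3, t4 = t3 + dt.

    gap plays the role of (t3 - t2); point_a and point_b are hypothetical
    (radius, initial phase) pairs.
    """
    t1 = 0.0
    t3 = t1 + dt + gap
    coeffs = []
    for t in (t1, t3):
        for r, theta0 in (point_a, point_b):
            b_now = beta(r, theta0, omega, l, t)
            b_next = beta(r, theta0, omega, l, t + dt)
            b_dot = (b_next - b_now) / dt              # forward difference
            coeffs.append(2.0 * b_dot / math.sin(2.0 * b_now))
    a1, b1, a2, b2 = coeffs
    omega_sq = (a1 * b1 * (a2 - b2) - a2 * b2 * (a1 - b1)) / ((a1 - b1) - (a2 - b2))
    return math.sqrt(omega_sq)


if __name__ == "__main__":
    for dt in (1e-5, 1e-3, 1e-2):
        print(dt, estimate_omega(omega=10.0, dt=dt, gap=0.02))
```

Running the loop for several values of dt gives a rough picture of how the finite-difference approximation of the optical flow affects the estimate; the numbers it prints depend on the hypothetical geometry and are not expected to reproduce Table 1.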

These  various  tests  verified  the  theoretical  results,  and  showed  some  limitations  of  the  method. 
There  are  three  situations  that  may  result  in  large  errors  in  the  analysis  of  the  angular  velocity: 

(a) $\theta_A - \theta_B = n\pi$, where $n = \pm 1, \pm 2, \pm 3, \ldots$

In this case, points A and B appear at opposite locations relative to the rotation axis, and therefore no
new information about the motion of the object is gained.

(b) $\theta_A + \theta_B = \pi + 2n\pi$, where $n = \pm 1, \pm 2, \pm 3, \ldots$

This causes a singularity point in Equation (16).

(c) $\Delta t$ is not "small enough". This results in an error in estimating the optical flow $d\beta/dt$.

3.2:  Theodolite  measurements 

In this part of the experiment a theodolite is used to measure the $\beta$ angles. Figure 3 shows an
object held by a robot. Points A, B and F were marked on the object by the tips of pins that were
inserted into the object, as shown in Figure 4, and the theodolite was located about 1.2 meters away from the
axis of rotation.

During the experiment, the robot rotated the object to four different orientations. The $\beta$ angles,
viewed by the theodolite, were recorded at each orientation to yield the data that is necessary for the
analysis of the angular velocity, i.e., angles $\beta_A(t_1)$, $\beta_A(t_2)$, $\beta_A(t_3)$, $\beta_A(t_4)$, $\beta_B(t_1)$, $\beta_B(t_2)$,
$\beta_B(t_3)$ and $\beta_B(t_4)$.

Note that there is no real time dimension in this experiment since we stop the robot for every
measurement. If one assumes that the angular velocity is $\omega$, then the time dimension can be obtained by
calculating $\Delta t$ using $\Delta t = \Delta\theta/\omega$. $\dot{\beta}$ can be approximated using the angles extracted from the theodolite
divided by $\Delta t$ as in Equations (18)-(21). Using these equations, the $\omega$ obtained from the visual data and
the calculated $\Delta t$ can be compared with the assumed angular velocity.
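
The bookkeeping of this procedure can be sketched as follows. It is a hedged illustration, not the processing used in the experiment: the angles are synthetic (generated from Equation (1) with hypothetical radii and phases, and a hypothetical l of 1.2), the function names are invented for the example, and forward differences are assumed for the derivatives.

```python
import math


def omega_from_static_angles(beta_a, beta_b, dtheta, omega_assumed):
    """Estimate |omega| from angles recorded at four static orientations.

    beta_a, beta_b: angles of points A and B at orientations 1..4 (radians).
    dtheta: rotation of the object between orientations 1->2 and 3->4 (radians).
    omega_assumed: the angular velocity assumed in order to define a time axis.
    """
    dt = dtheta / omega_assumed                 # Delta t = Delta theta / omega
    coeffs = []
    for i in (0, 2):                            # orientations 1 and 3
        for beta in (beta_a, beta_b):
            beta_dot = (beta[i + 1] - beta[i]) / dt
            coeffs.append(2.0 * beta_dot / math.sin(2.0 * beta[i]))
    a1, b1, a2, b2 = coeffs
    omega_sq = (a1 * b1 * (a2 - b2) - a2 * b2 * (a1 - b1)) / ((a1 - b1) - (a2 - b2))
    return math.sqrt(omega_sq)


def synthetic_angles(r, theta0, l, orientations):
    """Angles from Equation (1) at the given object orientations."""
    return [math.atan(r * math.cos(theta0 + th) / l) for th in orientations]


def ratio(omega_assumed, dtheta=0.01, gap=0.3, l=1.2):
    """Estimated / assumed angular velocity for a hypothetical object."""
    orientations = (0.0, dtheta, gap, gap + dtheta)
    beta_a = synthetic_angles(0.10, 0.3, l, orientations)
    beta_b = synthetic_angles(0.15, 1.1, l, orientations)
    estimate = omega_from_static_angles(beta_a, beta_b, dtheta, omega_assumed)
    return estimate / omega_assumed


if __name__ == "__main__":
    print(ratio(1.0), ratio(5.0))
```

Because every coefficient in Equations (12)-(15) scales with 1/dt, the estimate from Equation (16) scales linearly with the assumed angular velocity. The ratio of estimated to assumed value therefore does not depend on which value is assumed, and it is this ratio that the comparison effectively tests.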


Figure  3:  The  theodolite  experimental  setup 


Figure  4:  The  object  used  in  the  theodolite  experiment. 

The real data results show that the method is quite robust. For 24 different experiments the
average error in $\omega$ is 7.4% as shown in Table 2.

3.3:  Camera  measurements 

In  this  part  of  the  experiment  we  used  the  suggested  method  to  extract  the  angular  velocity  of  a 
rotated  object  using  a CCD  video  camera.  The  experimental  set-up  is  shown  in  Figure  5. 

The  following  are  some  details  of  the  experimental  set-up:  The  distance  between  camera  and  the 
fixation  point  is  150  cm,  and  the  camera's  field  of  view  is  about  9 degrees.  The  object  has  a minimum 
radius  of  22  cm.  In  order  to  simplify  the  image  processing,  the  object  has  well  defined  dark  vertical 
strips  located  on  a white  background,  for  a high  contrast,  as  shown  in  Figure  6.  The  object  is  rotated  at 
an  angular  velocity  of  0.021  rad/s. 

During the experimental process we found some practical limitations: (a) The value of $\omega\Delta t$
should be large enough so that the point being tracked passes through a "sufficient number of pixels" to
decrease the effect of the discretization error, yet theoretical limitations impose an upper bound on $\Delta t$.
(b) Points that produce large values of optical flow cause larger errors in the computed $\omega$. This is due to
the fact that these points produce little change in optical flow.


Figure  5:  Camera  measurements  experimental  setup 


Figure  6:  The  object  used  in  the  camera  experiment. 

Under these experimental conditions the average error in $\omega$ over 24 experiments is about 3.5%
as shown in Table 3. In every experiment the processing was done over 20 different sets of points
simultaneously, and the results were averaged. The real angular velocity of the robot was 0.021 rad/s.


$\omega$ (rad/s)    10.0      10.0      10.0      10.0      10.0      10.0      10.0      10.0
$\Delta t$ (s)      0.00001   0.00001   0.001     0.001     0.01      0.01      0.1       0.1
Result (rad/s)      10.0005   10.0005   10.0431   10.0387   9.9753    9.9493    10.2746   10.4344

Table 1: The effect of $\Delta t$ on the error in $\omega$. True $\omega$ is 10 rad/s.


Exp. #            1       2       3       4       5       6       7       8
Result (rad/s)    1.006   1.010   0.956   1.092   1.016   1.140   0.882   1.225

Exp. #            9       10      11      12      13      14      15      16
Result (rad/s)    1.103   0.958   1.002   0.845   1.059   0.989   0.945   0.953

Exp. #            17      18      19      20      21      22      23      24
Result (rad/s)    1.005   0.997   1.007   1.446   0.979   0.966   1.071   0.978

Table 2: Results of $\omega$ using data obtained from theodolite. True $\omega$ is 1 rad/s.


Exp. #            1        2        3        4        5        6        7        8
Result (rad/s)    0.0203   0.0210   0.0224   0.0213   0.0228   0.0195   0.0226   0.0225

Exp. #            9        10       11       12       13       14       15       16
Result (rad/s)    0.0219   0.0249   0.0221   0.0220   0.0221   0.0226   0.0231   0.0247

Exp. #            17       18       19       20       21       22       23       24
Result (rad/s)    0.0215   0.0218   0.0201   0.0199   0.0210   0.0216   0.0198   0.0201

Table 3: Results of $\omega$ using data obtained from camera. True $\omega$ is 0.021 rad/s.


4:  Conclusions  and  Future  work 

In this paper, it has been shown that during the process of camera fixation there is a
scene-independent visual-motion invariant. The result is stated in a closed form, and can be obtained using the
optical flow of only two points at two different time instants. The results of the described experiments,
obtained from both simulated and real data, are highly encouraging. Currently we are extending the
method to general motion and textured environments using a feature tracking method [10].